Text File | 1997-02-20 | 11KB | 214 lines
_______________________________________________________
D/Noise 1.0d
A Digital Audio Denoising Tool
_______________________________________________________
Windows 95 version
(C) 1996 Fast Mathematical Algorithms and Hardware Corporation,
1020 Sherman Avenue, Hamden, CT 06514.
<http://www.fmah.com>
_______________________________________________________
INTRODUCTION
This demonstration version is meant to illustrate some of our current
work in the area of audio signal processing. It is in no way suited for
commercial denoising. This version will only operate on monophonic
16-bit WAV (Waveform Audio File Format) files. In addition, the input
file is limited to one million sample points.
This version of D/Noise does not support algorithm iteration, i.e.,
the denoising algorithm makes only a single pass through an audio file,
separating it into what it thinks is coherent and what is noise. In the
next release, you will be able to specify how many times the algorithm
will pass through a file and its components to achieve a more thorough
separation. In addition, you will be able to save a compressed version
of the denoised file as well as apply some basic pre- and
post-processing transforms.
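The input restrictions listed above (monophonic, 16-bit, at most one
million sample points) can be checked before opening a file in D/Noise.
The following is a minimal sketch using Python's standard wave module;
the function name check_input and the wording of the messages are
illustrative assumptions and are not part of D/Noise.

```python
import wave

MAX_SAMPLES = 1_000_000  # demo-version limit stated in this README


def check_input(path):
    """Return a list of reasons the file would not qualify (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append("file is not monophonic")
        if wav.getsampwidth() != 2:          # sample width in bytes
            problems.append("samples are not 16-bit")
        if wav.getnframes() > MAX_SAMPLES:
            problems.append("file exceeds one million sample points")
    return problems
```

An empty list means the file meets all three restrictions.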
_______________________________________________________
Installation
----------------
This distribution consists of 4 files which should stay together in one
folder:
(1) Preliminary documentation (this file)
(2) Dnoise.exe, the shell used to run the algorithm
(3) Denoise.dll, the denoising algorithm in a dynamic link library
(4) Caruso.wav, a sample audio file containing a snippet of Enrico
Caruso's singing, recorded in 1904
Opening a WAV file for denoising
-------------------------------------------------
D/Noise performs a one-pass denoising procedure on an open WAV
file. To open a file:
[1] Select "Open..." from the "File" menu.
[2] Locate and open a file using the standard file dialog.
You can run the denoising procedure on the entire file or just a short
segment of it. To select a segment of your source file, click-drag across
it with the mouse. The toolbar along the top of the main window has a
couple of standard controls for scrolling the waveform representation of
the file, as well as zooming in and out.
The smallest length of the signal you can select is determined by the
control at the right end of the toolbar at the bottom of the main
window. This length also determines the size of the sliding signal
window used in the denoising procedure. A length of 1,024 sample
points is usually adequate. [NB: the other controls in the bottom
toolbar are not functional yet and appear disabled.]
Setting denoising parameters
----------------------------------------
The outcome of the denoising procedure depends on the settings of
various parameters. The exact meaning of these parameters is
explained at the end of this document.
To open the denoising algorithm interface, select "Configure..." from
the "Denoise" menu.
You can select one of two default parameter sets or enter your own.
To select a default set, click on the "Default 1" or "Default 2" button in
the "Parameters" frame.
You can also set your own parameter values. You can use the [tab]
key to jump from one box to the next. You will get an error message if
you try to enter a value outside the range of a specific parameter.
Running the denoising procedure
-----------------------------------------------
[1] Setting the output files
The denoising process will leave your original input file untouched
and generate two new files. The first of these two new files will
contain the coherent ("clean") component of the source file and the
second will contain the noisy component. In an extended procedure
you could run the process on the noisy file again to extract even more
coherent parts and add those to the first clean file. This version of
D/Noise does not yet support this type of iteration (although you can
do this "by hand"). In the next release, you will be able to specify a
number of iterations for the algorithm.
Use the Select... buttons to select names and locations for the two
output files [Hint: if your files are fairly small and you have RAM to
spare, you may want to put the output files on a RAM disk to speed up
the process and minimize disk thrashing. You will need the same
amount of storage for each of the coherent and noisy files as you need
for your source file].
[2] Starting the procedure
Click the [Denoise All] or the [Denoise Selection] button at the
bottom of the dialog box. The procedure starts and progress
information is displayed. You can abort the procedure at any time by
clicking the [Stop] button. Note that it may take a little while before
the algorithm stops, as event polling is kept to a minimum in order not
to slow down the process. When finished, close the dialog box by
clicking the [Done] button. You can now open and see/hear the
resulting coherent and noise files.
_______________________________________________________
About the D/Noise Algorithm and its Control Parameters
by Maxim J. Goldberg and Igor Popovic
_______________________________________________________
INTRODUCTION
The D/Noise family of algorithms was developed for the purpose of
removing noise from one dimensional signals, in particular, speech or
music signals, by the method of denoising proposed by R. Coifman
and V. Wickerhauser. One starts with a library of orthonormal
waveforms, which typically includes wavelet packets and local
trigonometric bases. A signal is expanded in each basis, and a cost
assigned to the expansion. The basis giving rise to the least cost is
chosen, the coefficients are ordered by magnitude, and a number of the
leading terms is kept as the coherent part based on a predetermined
threshold cost of the remaining terms. These leftover terms constitute
by definition the noisy part of the signal, and can be treated as a new
signal which can in turn be expanded and separated into its coherent
and noisy components.
In D/Noise, we use only one library of bases, those arising from the
dyadic decomposition tree obtained by constructing local sines on the
frequencies of a smoothly cut window from the signal. A "best" basis
is chosen by comparing the cost of a parent node to the sum of the
costs of its two children. In D/Noise, the cost function can be chosen to
be Shannon entropy or the l^p norm of the coefficients of an expansion. We
attempt to deal with numerical artifacts arising from the processing by
(1) allowing shifts in time and frequency, and (2) by segmenting into
large windows and only using the uncorrupted middle core. The large
window we are using is 4 times the size of the core. For example, if
the user selects a signal window of 1,024 samples, internally we slide
and denoise a window of 4,096 samples and use only its 1,024-sample
core in the reconstruction. This strategy has proven to give more
pleasing results than any other "fancy" windowing.
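The parent-versus-children comparison described above can be sketched
as a bottom-up search over a dyadic tree. This is only an illustration
of the general best-basis idea, not the code in Denoise.dll: the
function name best_basis, the (level, index) node labelling, and the
rule of keeping the parent on ties are all assumptions.

```python
def best_basis(cost, depth):
    """Pick the cheapest set of nodes covering the root of a dyadic tree.

    cost maps (level, index) -> cost of the expansion on that node;
    node (l, i) has children (l + 1, 2 * i) and (l + 1, 2 * i + 1).
    Returns (total_cost, list_of_chosen_nodes).
    """
    def visit(level, index):
        own = cost[(level, index)]
        if level == depth:                   # leaf: nothing to compare with
            return own, [(level, index)]
        left_cost, left_nodes = visit(level + 1, 2 * index)
        right_cost, right_nodes = visit(level + 1, 2 * index + 1)
        children = left_cost + right_cost
        if own <= children:                  # assumed tie rule: keep parent
            return own, [(level, index)]
        return children, left_nodes + right_nodes

    return visit(0, 0)
```

Each node is compared against the best achievable cost of its two
subtrees, so a split is kept only where it actually lowers the total.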
PARAMETERS
(1) Window size
This parameter determines the number of consecutive samples
processed at one time. Internally, the algorithm slides two "windows"
of the selected width through the signal, offset by 1/2 their width. In
addition, each window is extended to both its sides and only the core is
used in the reconstruction after denoising. The windows should not be
too narrow, since good frequency resolution is desirable, in particular
for music. Nor should the windows be too wide, since information
spread over time might mask local occurrences. For music, window
sizes of 512, 1,024, or perhaps 2,048 seem to be the ones to consider
first.
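The window bookkeeping described above can be sketched as follows. It
assumes that the large window is centred on the core (so a core of
1,024 samples gets 1,536 extra samples on each side) and that the two
half-offset "windows" are two interleaved series of cores. Neither
detail is spelled out in this document, and the function names are
hypothetical.

```python
def extended_window(core_start, core_size, factor=4):
    """Return (start, stop) of the large window centred on a core segment.

    A negative start or a stop past the end of the signal would have to
    be handled by the caller (for example by padding); that handling is
    not shown here.
    """
    margin = (factor - 1) * core_size // 2
    return core_start - margin, core_start + core_size + margin


def core_starts(signal_length, core_size):
    """Start offsets of two series of cores, offset by half a core."""
    first = list(range(0, signal_length, core_size))
    second = list(range(core_size // 2, signal_length, core_size))
    return first, second
```

With a core of 1,024 samples starting at sample 4,096, the large window
spans samples 2,560 up to (but not including) 6,656, which is 4,096
samples in total.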
(2) Log2 of reach
This is the log-base-2 of the size of the smallest interval to be
considered by the local trigonometric transform decomposition tree.
For example, the preset of 4 will give you 2^4=16 samples as the
smallest interval to be considered.
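As a small worked example of this parameter, the sketch below lists the
dyadic interval sizes from a full window down to 2^(log2 of reach). It
assumes the window size is a power of two. Which window the tree is
actually built on is not stated here, so the function interval_sizes is
purely illustrative.

```python
def interval_sizes(window_size, log2_reach):
    """Dyadic interval sizes from window_size down to 2**log2_reach."""
    smallest = 1 << log2_reach
    if smallest > window_size:
        raise ValueError("smallest interval exceeds the window size")
    sizes = []
    size = window_size
    while size >= smallest:
        sizes.append(size)
        size //= 2                       # each tree level halves the interval
    return sizes
```

For a window of 1,024 samples and the preset of 4, this gives seven
levels: 1024, 512, 256, 128, 64, 32 and 16 samples.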
(3) Energy threshold
This is the energy threshold for discarding coefficients from the
extracted signal basis. This number, typically .0001, or .000001,
means that in the chosen basis, those coefficients of size less than
(energy threshold) * (energy of window segment) are set to zero and
thus discarded.
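The thresholding rule above can be sketched as follows. Two assumptions
are made that the text does not settle: the "size" of a coefficient is
taken to be its square, and the energy of the window segment is taken
to be the sum of the squared coefficients (the two agree for an
orthonormal basis). The function name is hypothetical.

```python
def apply_energy_threshold(coeffs, energy_threshold):
    """Zero out coefficients whose squared size falls below the cutoff."""
    energy = sum(c * c for c in coeffs)      # energy of the window segment
    cutoff = energy_threshold * energy
    return [c if c * c >= cutoff else 0.0 for c in coeffs]
```

For example, with coefficients 3.0, 4.0 and 0.01 and a threshold of
.0001, the cutoff is about 0.0025, so only the third coefficient
(squared size 0.0001) is discarded.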
(4) Entropy
A real number alpha that determines which entropy function will be used
to separate out the noise component: alpha = 0.0 selects Shannon entropy,
while 0 < alpha < 1 selects the l^p norm with p = 2*alpha. For
example, entering 0.5 will result in the l^1 norm being used.
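The mapping from alpha to a cost function can be sketched as below.
This is an illustration under assumptions, not the D/Noise code: the
Shannon entropy is computed on coefficient energies normalised to sum
to one (an implementation may use an unnormalised variant), and the
l^p cost is returned as the sum of |c|^p without the final 1/p root.
The function name cost is hypothetical.

```python
import math


def cost(coeffs, alpha):
    """Cost of an expansion; alpha selects the cost function."""
    if alpha == 0.0:
        # Shannon entropy of the normalised energy distribution.
        total = sum(c * c for c in coeffs)
        if total == 0.0:
            return 0.0
        entropy = 0.0
        for c in coeffs:
            p = c * c / total
            if p > 0.0:                  # treat 0 * log(0) as 0
                entropy -= p * math.log(p)
        return entropy
    if 0.0 < alpha < 1.0:
        p = 2.0 * alpha                  # e.g. alpha = 0.5 gives p = 1
        return sum(abs(c) ** p for c in coeffs)
    raise ValueError("alpha must satisfy 0 <= alpha < 1")
```

In both cases a lower cost corresponds to an expansion whose energy is
concentrated in fewer coefficients.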
(5) Entropy ratio
This real number specifies the threshold mentioned in (3) above. A
ratio of 1.0 or higher means that all the entries of expansion of the
window segment will be considered to be coherent, while a ratio of 0.0
or less means that the entire signal coming from each window will be
considered to be noise. For music, a good testing entropy ratio may be
between 0.3 and 0.4 if using Shannon entropy (alpha = 0.0 in (4)
above); 0.7 works well for alpha = 0.5.
(6) Time shift
A specific value k means the signal is padded with k zeros in front,
the whole program is run, and then the output files are shifted back to
the left by k samples. The purpose of different shifts in time is to make
the signal window cuts occur in different places. It is recommended
that any shifts chosen be prime, or nearly prime, numbers without
high powers of two in their factorization, and that each shift be
less than one half of the window size set in (1) above.
As mentioned previously, this version does not yet support any type of
iteration. If you run the algorithm on the same file specifying a
different time shift on each run, you will have to average the resulting
files by hand, i.e. using some audio file mixing utility (a free utility
package for AIFF files will be included with the next release).
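The pad, run, shift-back procedure and the averaging "by hand" can be
sketched as follows. Here denoise stands for any function that maps a
list of samples to a denoised list of the same length. It is a
hypothetical placeholder, since D/Noise itself is operated through its
dialog box rather than called from code.

```python
def run_with_time_shift(signal, k, denoise):
    """Pad with k zeros in front, denoise, then shift back by k samples."""
    padded = [0] * k + list(signal)
    result = denoise(padded)
    return result[k:]


def average_runs(runs):
    """Average several equally long outputs sample by sample."""
    length = min(len(run) for run in runs)   # guard against length mismatch
    return [sum(run[i] for run in runs) / len(runs) for i in range(length)]
```

Typical use would be to call run_with_time_shift once per chosen shift
and pass the list of results to average_runs.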
(7) Frequency shifts
In this field you can enter up to 9 integers, each specifying
a shift in the frequency domain of the signal. As in (6), the purpose is
to average out cutting artifacts from the spectrum when performing the
adapted local trigonometric transform on the signal's frequencies.
Small primes are recommended; the default presets should suffice.